Stennis  Space  Center,  MS  39529-5004 


Naval  Research  Laboratory 


NRL/MR/7440-08-9126 


An  Interactive  Parallel  Coordinates 
Technique  Applied  to  a  Tropical 
Cyclone  Climate  Analysis 


Chad  A.  Steed 

Mapping,  Charting,  and  Geodesy  Branch 
Marine  Geosciences  Division 

Patrick  J.  Fitzpatrick 

Northern  Gulf  Institute ,  Mississippi  State  University 
Stennis  Space  Center,  Mississippi 

T.J.  Jankun-Kelly 
J.  Edward  Swan  II 

Department  of  Computer  Science 
Mississippi  State  University,  Mississippi 

Amber  N.  Yancey 

Department  of  Physics  and  Astronomy 
Mississippi  State  University,  Mississippi 


June  6,  2008 


Approved  for  public  release;  distribution  is  unlimited. 


REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB  No.  0704-0188 


Public  reporting  burden  for  this  collection  of  information  is  estimated  to  average  1  hour  per  response,  including  the  time  for  reviewing  instructions,  searching  existing  data  sources,  gathering  and 
maintaining  the  data  needed,  and  completing  and  reviewing  this  collection  of  information.  Send  comments  regarding  this  burden  estimate  or  any  other  aspect  of  this  collection  of  information,  including 
suggestions  for  reducing  this  burden  to  Department  of  Defense,  Washington  Headquarters  Services,  Directorate  for  Information  Operations  and  Reports  (0704-0188),  1215  Jefferson  Davis  Highway, 
Suite  1204,  Arlington,  VA  22202-4302.  Respondents  should  be  aware  that  notwithstanding  any  other  provision  of  law,  no  person  shall  be  subject  to  any  penalty  for  failing  to  comply  with  a  collection  of 
information  if  it  does  not  display  a  currently  valid  OMB  control  number.  PLEASE  DO  NOT  RETURN  YOUR  FORM  TO  THE  ABOVE  ADDRESS. 


1.  REPORT  DATE  (DD-MM-YYYY) 
06-06-2008 


2.  REPORT  TYPE 

Memorandum  Report 


3.  DATES  COVERED  (From  -  To) 


4.  TITLE  AND  SUBTITLE 

An  Interactive  Parallel  Coordinates  Technique 
Applied  to  a  Tropical  Cyclone  Climate  Analysis 


5a.  CONTRACT  NUMBER 


5b.  GRANT  NUMBER 


5c.  PROGRAM  ELEMENT  NUMBER 


6.  AUTHOR(S) 

Chad  A.  Steed,  Patrick  J.  Fitzpatrick,  T.J.  Jankun-Kelly, 
Amber  N.  Yancey,  and  J.  Edward  Swan  II 


5d.  PROJECT  NUMBER 


5e.  TASK  NUMBER 


5f.  WORK  UNIT  NUMBER 

74-9531-08 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Naval  Research  Laboratory 
Marine  Geosciences  Division 
Stennis  Space  Center,  MS  39529-5004 


8.  PERFORMING  ORGANIZATION  REPORT 
NUMBER 


NRL/MR/7440-08-9126


9.  SPONSORING  /  MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 

Office  of  Naval  Research 
One  Liberty  Center 
875  North  Randolph  St. 

Arlington,  VA  22203-1995 


10.  SPONSOR  /  MONITOR’S  ACRONYM(S) 

ONR 


11.  SPONSOR  /  MONITOR’S  REPORT 
NUMBER(S) 


12.  DISTRIBUTION  /  AVAILABILITY  STATEMENT 


Approved  for  public  release;  distribution  is  unlimited. 


13.  SUPPLEMENTARY  NOTES 


14.  ABSTRACT 

An  enhanced  interactive  variant  of  the  parallel  coordinates  visualization  technique  is  presented.  An  example  of  its  capabilities  is  demonstrated  on 
a  hurricane  climate  dataset.  Its  capabilities  include  focus+context  filtering,  dynamic  visual  queries  with  sliders,  statistical  displays,  relocatable  axes, 
axis inversion, details-on-demand, a pop-up menu interface, and aerial perspective shading. Furthermore, parallel coordinates can visually depict
the  same  correlations  that  weather  scientists  find  meaningful.  It  is  demonstrated  that  these  interactive  parallel  coordinates  enhancements  provide  a 
deeper  understanding  when  used  in  conjunction  with  traditional  multiple  regression  analysis. 


15.  SUBJECT  TERMS 

Parallel  coordinates 
Exploratory  data  analysis 


Hurricane 
Climate  study 


16. SECURITY CLASSIFICATION OF:

a. REPORT: Unclassified
b. ABSTRACT: Unclassified
c. THIS PAGE: Unclassified

17. LIMITATION OF ABSTRACT: UL

18. NUMBER OF PAGES: 28

19a. NAME OF RESPONSIBLE PERSON: Chad Steed

19b. TELEPHONE NUMBER (include area code): (228) 688-4558

Standard  Form  298  (Rev.  8-98) 

Prescribed  by  ANSI  Std.  Z39.18 


CONTENTS 


1 Introduction

2 Related Work

3 Climate Study Dataset

4 A Dynamic Interactive Parallel Coordinates Application

5 Parallel Coordinates Validation: North Atlantic Case Study

6 Conclusion

Acknowledgements

References




1  Introduction 


In climate studies, scientists are interested in discovering which environmental
factors influence significant weather phenomena. A prominent weather feature
is the tropical cyclone, defined as a warm-core, non-frontal, synoptic-scale
cyclone originating over tropical or subtropical waters, with organized
thunderstorms and a closed surface wind circulation. Tropical cyclones begin as
tropical depressions, with sustained 10-meter winds less than 17 m s⁻¹. Most
intensify into tropical storms (sustained winds between 17 and 32 m s⁻¹). Of
all tropical cyclones, 56% reach winds of at least 33 m s⁻¹ and are then
designated with regional terms such as hurricanes in the Atlantic basin and
typhoons in the western North Pacific Ocean. When sustained 10-meter winds
reach 49 m s⁻¹, they are called intense hurricanes in the Atlantic.
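
The wind-speed thresholds above amount to a simple classification rule. The following is a minimal sketch of that rule, not part of the software described in this report; the function name is hypothetical, and treating speeds between 32 and 33 m s⁻¹ as tropical storms is an assumption made only to close the gap between the quoted ranges.

```python
def classify_tropical_cyclone(wind_ms: float) -> str:
    """Classify an Atlantic tropical cyclone by sustained 10-meter wind (m/s).

    Thresholds follow the text: 17, 33, and 49 m/s. Half-open intervals are
    an assumption of this sketch.
    """
    if wind_ms < 17:
        return "tropical depression"
    if wind_ms < 33:
        return "tropical storm"
    if wind_ms < 49:
        return "hurricane"
    return "intense hurricane"


# Illustrative wind speeds only, not observations.
for speed in (12.0, 25.0, 40.0, 55.0):
    print(speed, classify_tropical_cyclone(speed))
```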

Tropical cyclone activity in each ocean basin can vary on a yearly scale as well
as a multidecadal scale due to large-scale atmospheric influences and climate
forcing. As a result, scientists are developing procedures to forecast whether
an upcoming tropical cyclone season will be active, normal, or below normal.
Others are studying causes of multidecadal cycles, and whether anthropogenic
global warming is also an influence (Landsea, 2005). Recent destructive
tropical cyclone seasons have escalated these research efforts.

Several atmospheric and climate variables impact the intensity and frequency
of seasonal storm activity. Identifying the most critical environmental
variables helps scientists generate more accurate seasonal forecasts which, in
turn, improve the preparedness of the general public and emergency agencies.
One useful method for predicting and understanding the seasonal variability in
tropical cyclones is multiple regression. Predictors are chosen from historical
tropical cyclone data (Vitart, 2004), and the regression provides an ordered
list of the most important predictors for the dynamic parameters.

Researchers can also explore the relationship of a single predictor using
linear regression and scatter plots (Fig. 1) or histograms, which require
several separate plots or layered plots to analyze multiple variables. However,
separate plots are not very effective when several factors impact a dependent
variable. A major reason for their ineffectiveness is that the viewer is forced
to search for patterns across multiple images, resulting in a phenomenon called
change blindness: the inability of the low-level human perceptual system to
recall detail outside the viewing area (Ware, 2004). Layered plots can be used
instead, but problems can occur with the occlusion of underlying layers and
interference between the various layers (Healey et al., 2004). These
traditional visualization techniques were not designed to support rapid or
accurate multidimensional analysis. Furthermore, the geographically-encoded
data used in these climate studies are usually displayed in the context of a
geographical map; although certain important patterns (those directly related
to geographic position) may be recognized in this context, additional
information may be discovered more rapidly using non-geographical information
visualization techniques. Due to the multivariate nature of climate study data,
researchers need visualization techniques that can accommodate the simultaneous
display of many variables.

[Figure 1: scatter plot with linear regression line. x-axis: June-July SST in
the Northeastern Subtropical Atlantic. Fit: y = -50.24 + 2.54x, r² = 17%.]

Fig. 1. A common visualization technique used in climate studies is the scatter
plot overlaid with a linear regression line. This example shows the linear
relationship between June-July SST in the northeastern subtropical Atlantic
Ocean, and the number of hurricanes from 1950 to 2006. The explained variance
is 17%.


This paper discusses the application and extension of a popular multivariate
information visualization technique, parallel coordinates, to a tropical
cyclone climate study and regression analysis. Parallel coordinates yields a
two-dimensional representation of a multidimensional dataset. Each
n-dimensional data item is represented as a polyline whose n vertices lie on n
parallel y-axes, one per variable. The resulting visualization provides a
compact two-dimensional representation of even large multivariate datasets
(Siirtola, 2000). Parallel coordinates are extended here with dynamic
interaction. This paper also discusses how these techniques increase
scientists' ability to discover the relationships between dependent and
independent variables. Using a climate study dataset that consists of several
seasonal tropical cyclone predictors, it is shown that parallel coordinates
provides a useful representation of multiple regression analysis. The results
suggest that parallel coordinates can be used as an alternative method for
finding relationships among a set of variables, and the technique can be used
in conjunction with stepwise regression to enhance and speed up the
relationship discovery process.
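
The mapping from an n-dimensional record to a polyline can be illustrated with a short sketch. This is not the report's implementation: the function name, the pixel dimensions, and the choice to place each axis minimum at the bottom of the display are assumptions made for illustration.

```python
def to_polyline(record, mins, maxs, width=800.0, height=400.0):
    """Map one n-dimensional record to n (x, y) screen vertices.

    Axis i is a vertical line at x = i * spacing. Each value is scaled
    linearly between that axis's minimum (bottom) and maximum (top).
    Screen y grows downward, as in most 2D graphics toolkits.
    """
    n = len(record)
    spacing = width / (n - 1) if n > 1 else 0.0
    points = []
    for i, (value, lo, hi) in enumerate(zip(record, mins, maxs)):
        # Guard against a degenerate axis whose minimum equals its maximum.
        t = 0.5 if hi == lo else (value - lo) / (hi - lo)
        points.append((i * spacing, height * (1.0 - t)))
    return points


# Hypothetical record: (year, named storms, hurricanes).
print(to_polyline([1950, 10, 5], mins=[1950, 0, 0], maxs=[2000, 20, 10]))
```

Drawing one such polyline per record, with all records sharing the same axes, gives the basic parallel coordinates display on which the interactive features are built.
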

Table 1. New interaction and representation features added to the parallel
coordinates visualization technique.

Focus+Context          Interactively scales an axis and zooms into a subset
                       of relations for that axis.

Aerial Perspective     Facilitates visual queries by shading lines based on
                       proximity to the mouse cursor using a shading scheme
                       that mimics human perception.

Dynamic Visual Query   Explores multidimensional relationships with
                       double-sided sliders.

Statistical Indicators Indicates statistical quantities to support the
                       interaction model.

Relocatable Axes       Reorganizes the axes by dragging with the mouse to
                       observe the correlation between variables.

Axis Inversion         Inverts the axis display scale by swapping the top
                       and bottom values.

Details-on-demand      Shows additional details for the highlighted axis,
                       and displays the value on the axis scale under the
                       mouse by clicking on the axis with the middle mouse
                       button.

Customizable Display   Modifies the display (statistics display, color
                       schemes, tick marks) via a pop-up menu interface.

2  Related  Work 

The parallel coordinates visualization technique was first introduced by
Inselberg (1985) to represent hyper-dimensional geometries. Later, Wegman
(1990) applied the technique to the analysis of multivariate relationships in
data. Since then, several innovative extensions to the technique have been
described in visualization research literature. Hauser et al. (2002) proposed
several brushing extensions for parallel coordinates. The software described
in this paper implements a variant of this histogram display technique and
ascertains its usefulness in the statistical analysis of tropical cyclone
climate relationships. Additionally, a dynamic axis re-ordering feature, axis
inversion capability and some details-on-demand features similar to Hauser et
al. (2002) have been implemented. Furthermore, some interaction capabilities of
Siirtola (2000) (e.g., conjunctive queries) are added, as well as a variant of
the interactive aerial perspective shading technique of Jankun-Kelly and Waters
(2006). The aerial perspective shading used in this paper highlights
user-defined regions in the visualization using the mouse position and query
sliders. The application also includes a focus+context technique for axis
scaling (Novotny and Hauser, 2006).

This  new  software  also  provides  dynamic  query  capabilities  for  the  axes  based 
on  the  double  slider  concept  of  Ahlberg  and  Shneiderman  (1994).  Furthermore, 
the  axes  display  important  frequency  information  between  the  double  slider 
widgets  in  a  manner  similar  to  the  Influence  Explorer  of  Tweedie  et  al.  (1996). 
These  features  are  summarized  in  Table  1. 

Multiple regression traditionally has been used to identify statistically
significant variables from multivariate datasets, including tropical cyclone
datasets. Klotzbach et al. (2006a) use this technique to determine the most
important variables for predicting the frequency of tropical cyclone activity
for the North Atlantic basin. Similarly, Fitzpatrick applied stepwise
regression analysis to the prediction of tropical cyclone intensity
(Fitzpatrick, 1996, 1997). It will be shown that multiple regression and
dynamic parallel coordinates can complement each other, with the regression
identifying the relevant associations and the interactive software highlighting
additional features of the variables.


3  Climate  Study  Dataset 


This research analyzes a dataset containing potential environmental predictors
for a tropical cyclone climate study. This dataset was provided by the
Tropical Meteorology Project at Colorado State University (P. Klotzbach,
personal communication), and is used to predict the frequency of Atlantic
tropical cyclones for the upcoming hurricane season by category. These
categories are: 1) number of named storms (winds of 17 m s⁻¹ or more, at which
tropical cyclones receive a "name"); 2) number of hurricanes; and 3) number of
intense hurricanes. The predictor variables have known relationships to
Atlantic tropical cyclone activity. For example, the North Atlantic basin has
fewer tropical cyclones during El Nino Southern Oscillation (ENSO) years, and
more active seasons in La Nina years (Chu, 2004). Because of this relationship,
scientists use ENSO signals as predictors of seasonal storm activity.
Scientists at the Tropical Meteorology Project issue six forecast reports based
on statistically significant predictors from this dataset.

Table 2. Environmental tropical cyclone climate variables evaluated as
predictors in the multiple regression procedure. Columns: Variable Name,
Geographical Region.

Table 2 lists 16 potential environmental predictors from the dataset along
with their geographical region. In the remainder of this section, the physical
relationships of these climate variables to Atlantic tropical cyclone activity
are discussed.


3.1 El Nino Variables


In a normal year, air rises in the western tropical Pacific (where the water
is the warmest as well as slightly elevated) and sinks in the eastern tropical
Pacific, a phenomenon known as the Walker Circulation. During an El Nino
event, the easterly surface trade winds that cause this water bulge in the
western Pacific weaken, and the warm water travels eastward. Furthermore,
El Nino conditions shift the upward portion of the Walker Circulation to the
eastern Pacific, creating upper-level westerly winds in the Atlantic Ocean as
well as subsidence. Both of these factors inhibit tropical cyclone formation
and intensification in this region. Opposite conditions (abnormally strong
trade winds and colder than normal eastern Pacific water) are called La Nina.
La Nina years are associated with weak wind shear and little subsidence in the
Atlantic, typically producing high tropical cyclone activity in this basin.

El Nino events are characterized by several possible variables. The June-July
Nino 3 (1) variable represents sea surface temperature (SST) anomalies of
the eastern equatorial tropical Pacific Ocean. Positive values of this variable
indicate an El Nino event, and negative values indicate a La Nina event. May
SST in the eastern equatorial Pacific (2) represents a similar relationship.
The first clues of an impending El Nino can be detected in February by
observing three variables. Upper-level westerly (zonal) wind anomalies off the
northeast coast of South America imply that the upward branch of the Walker
Circulation associated with ENSO remains in the western Pacific and that El
Nino conditions are likely to be present in the eastern equatorial Pacific for
the next 4-6 months. This situation is measured by the February 200-mb zonal
wind (U) in equatorial East Brazil (3). Likewise, anomalous late winter
meridional (north) winds at 200-mb in the South Indian Ocean are also
associated with El Nino conditions (February-March 200-mb V in the South Indian
Ocean (4)). Finally, sea level pressure (SLP) in the eastern Pacific south of
the equator is a measure of the trade winds, whereby weak trade winds (or
westerly surface winds) are associated with lower SLP and, therefore, El Nino
conditions, while the opposite is correlated to La Nina conditions. Therefore,
February SLP in the eastern South Pacific (5) is a possible variable. Some fall
variables are also correlated to El Nino conditions, such as the
October-November SLP in the Gulf of Alaska (6), September 500-mb Geopotential
Height in western North America (7), and November SLP in the subtropical
northeast Pacific (8).

3.2 Sea Level Pressure Variables


Pressure in the Atlantic Ocean is also inversely related to tropical cyclone
activity, and seems to contain both monthly and longer-term relationships.
Low SLP in the tropical Atlantic implies increased atmospheric instability,
moisture, and ascent (more favorable for the genesis of tropical cyclones), and
weaker trade winds (which correspond to less wind shear that can tear up the
thunderstorms in tropical cyclones). Low SLP in the spring tends to persist
through the summer and fall. Therefore, potential variables include
March-April SLP in the eastern tropical Atlantic (9), June-July SLP in the
tropical Atlantic (10), and September-November SLP in the southeast Gulf of
Mexico (11).


3.3  Teleconnection  Variables 


The atmosphere is characterized by long-term oscillations which impact global
wind patterns, known as teleconnections. Two of these are the Arctic
Oscillation and the North Atlantic Oscillation. When these oscillations are in
one phase, they cause more ridges in the Atlantic, which corresponds to less
wind shear. Also, on decadal timescales, weaker zonal winds in the sub-polar
areas are indicative of a relatively strong thermohaline circulation and
therefore a warmer Atlantic Ocean. A variable which measures this oscillation
is the November 500-mb Geopotential Height in the North Atlantic (12).


3.4 Quasi-Biennial Oscillation Variable


Research has also shown that the Quasi-Biennial Oscillation (QBO) is
correlated to tropical cyclone activity. The QBO is a stratospheric (16 to 35
km altitude) oscillation of equatorial east-west winds which varies with a
period of about 26 to 30 months, or roughly 2 years. These winds typically blow
for 12-16 months from the east, then reverse and blow 12-16 months from the
west, then return to easterly again. The west phase of the QBO has been shown
to provide favorable conditions for development of tropical cyclones, possibly
because it reduces wind shear. A variable which measures the QBO is the July
50-mb Equatorial Wind (U) around the globe (13).

Fig. 2. An annotated view of the parallel coordinate axis display widget. Normally,
an  axis  is  displayed  using  a  muted  color  scheme  (left).  However,  when  the  mouse 
moves  into  an  axis  space,  the  axis  is  displayed  with  the  highlighted  color  scheme 
(right). 

3.5  Atlantic  Sea  Surface  Temperature  Variables 


The  Atlantic  SST  is  another  major  influence  on  tropical  cyclone  activity  in 
that  basin.  Like  SLP,  winter  and  spring  anomalies  tend  to  persist  throughout 
the  season.  Therefore,  February  SST  off  the  northwest  European  Coast  (14), 
April-May  SST  off  the  northwest  European  Coast  (15),  and  June-July  SST 
in the northeast subtropical Atlantic (16) are potential predictors. In
addition, warm SST anomalies tend to correlate with low SLP.


4  A  Dynamic  Interactive  Parallel  Coordinates  Application 


To facilitate a deeper understanding of the climate data, a parallel coordinates
application with several interactive extensions has been developed. This
application's capabilities include focus+context filtering, dynamic visual
queries with sliders, statistical displays, relocatable axes, axis inversion,
details-on-demand, a pop-up menu interface, and aerial perspective shading.

The viewer is often interested in grouping subsets of data. A method to select
lines using sliders meets this need (Siirtola, 2000; Ahlberg and Shneiderman,
1994). As shown in Fig. 2, each axis has a pair of sliders which define the
top and bottom range for the query area. The viewer can drag these sliders to
dynamically adjust which lines are highlighted. Lines within the query area
are rendered with a thicker line and a more prominent color while the
remaining lines are rendered with a thinner line and a shade of gray (more
detail on the shading algorithm is given in Section 4.2). An example of a
conjunctive query using the sliders is shown in Fig. 3. In this image, the
sliders show that only two storm seasons had an above average number of named
storms but a below average number of intense hurricanes. In other words, when
many named storms are observed, there tends to be an average or above average
number of intense hurricanes as well.

[Figure 3: screen shot with axes Storm Year, Named Storms, Hurricanes, and
Intense Hurricanes.]

Fig. 3. [...] set for the above average range of the Named Storms axis and the
below average range of the Intense Hurricanes axis for data between years 1950
and 2006. This query reveals that only 2 storm seasons fulfilled these
criteria.

[Figure 4: annotated axis bar for February SST (displayed values 13.07 and
14.65), labeled with the upper context, query, focus, and lower context areas.]

Fig. 4. The axis bar is segmented into four distinct areas: the query area, the
focus area, and an upper and lower context area.
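
The conjunctive slider query described above can be sketched as a filter that keeps only the records falling inside every axis's query range. The sketch below is illustrative and is not the application's code; the record layout, the function name, and the season counts are hypothetical.

```python
from statistics import mean


def conjunctive_query(records, ranges):
    """Return records whose value lies inside the query range of EVERY axis.

    records: list of dicts mapping axis name -> value.
    ranges:  dict mapping axis name -> (low, high), the two slider positions.
    Axes without an entry in `ranges` are unconstrained.
    """
    return [
        r for r in records
        if all(lo <= r[axis] <= hi for axis, (lo, hi) in ranges.items())
    ]


# Hypothetical seasons, for illustration only (not the report's data).
seasons = [
    {"year": 1, "named_storms": 15, "intense_hurricanes": 1},
    {"year": 2, "named_storms": 8, "intense_hurricanes": 1},
    {"year": 3, "named_storms": 14, "intense_hurricanes": 5},
    {"year": 4, "named_storms": 7, "intense_hurricanes": 3},
]
ns_mean = mean(s["named_storms"] for s in seasons)
ih_mean = mean(s["intense_hurricanes"] for s in seasons)

# Above average named storms AND below average intense hurricanes.
selected = conjunctive_query(seasons, {
    "named_storms": (ns_mean, max(s["named_storms"] for s in seasons)),
    "intense_hurricanes": (min(s["intense_hurricanes"] for s in seasons), ih_mean),
})
print([s["year"] for s in selected])
```

Because the ranges are combined with a logical AND, tightening the sliders on any one axis can only shrink the selected set.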

The  application  also  provides  a  details-on-demand  capability.  The  viewer  can 
click  on  an  axis  with  the  middle  mouse  button  to  display  the  value  on  the 
axis scale under the mouse (Hauser et al., 2002). The application also displays
values  for  the  top  and  bottom  of  the  focus  area  and  applies  the  highlight  color 
to  the  axis  whose  area  is  intersected  by  the  mouse  cursor.  Furthermore,  the 
application  display  can  be  customized  through  a  pop-up  menu  initiated  by  the 
right  mouse  button.  This  menu  controls  statistics,  color  schemes,  tick  marks, 
and  screen  captures.  These  features  will  now  be  discussed  in  more  detail. 


4.1 Axis Scaling (Focus+Context)


In displays where many relation lines are shown, it is often desirable to
interactively tunnel through the relations until a smaller subset of the
original dataset is in focus. This application allows the user to modify the
minimum and maximum values of the axes using the mouse wheel. On the axis bar,
there are three distinct areas delineated by horizontal tick marks (Fig. 4)
that are important to the axis scaling capability: the central focus area, and
the top and bottom context areas. When the mouse is hovering over the focus
area, an upward mouse wheel motion expands the display of the focus area
outward and pushes outliers to the context areas (Fig. 5). A downward mouse
wheel motion causes the inverse effect: focus region compression.
Alternatively, the user may use the mouse wheel over either of the two context
areas to alter the minimum or maximum values separately. The scaling capability
reduces clutter, making it easier to analyze relation lines of interest.

[Figure 5: two screen shots, panels (a) and (b).]

Fig. 5. A screen shot of the parallel coordinates application before (a) and
after (b) scaling has been performed. In this example, scaling occurs by
performing an upward mouse wheel motion in the focus area of the axis, which
moves the values for the top and bottom closer together, effectively stretching
the display upward and downward (with the base of the display fixed).
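
One way to picture the focus+context scaling is as a change to the value range assigned to the focus area. The sketch below is an assumption-laden illustration rather than the application's algorithm: the function names, the per-step zoom rate, and zooming symmetrically about the center of the focus range are all hypothetical choices.

```python
def zoom_focus(focus_min, focus_max, wheel_steps, rate=0.1):
    """Return a new (focus_min, focus_max) after a mouse wheel motion.

    Positive wheel_steps (upward motion) narrows the value range shown in
    the focus area, which stretches the display; negative steps widen it.
    The symmetric zoom about the center is an assumption of this sketch.
    """
    center = (focus_min + focus_max) / 2.0
    half = (focus_max - focus_min) / 2.0 * (1.0 - rate) ** wheel_steps
    return center - half, center + half


def region_of(value, focus_min, focus_max):
    """Name the axis area in which a value is drawn."""
    if value > focus_max:
        return "upper context"
    if value < focus_min:
        return "lower context"
    return "focus"


lo, hi = zoom_focus(10.0, 20.0, wheel_steps=1, rate=0.5)
print(lo, hi)                    # narrowed focus range
print(region_of(19.0, lo, hi))   # a former focus value is now an outlier
```

After zooming in, values that fall outside the narrowed range are assigned to a context area, which is how outliers get pushed out of the focus region.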


4.2 Aerial Perspective


Aerial perspective shading is useful for quickly monitoring trends due to the
similarity of data values over multiple dimensions in parallel coordinates
(Jankun-Kelly and Waters, 2006). In this implementation, aerial perspective
shading can be used in either a discrete or a continuous mode. In the discrete
mode, the lines are colored according to the axis region that they intersect.
If any point of a relation line is in the context region of at least one axis,
the line is shaded with a light gray color and drawn beneath the non-context
lines (Fig. 5). If all the points on a line fall within the query area of each
axis (the area between the two query sliders), the line is colored using a dark
gray that attracts the viewer's attention (Fig. 6). The remaining lines are
colored with a gray that is slightly darker than the context lines.

[Figure 6: two screen shots with axes Storm Year, Intense Hurricanes, February
SST, and June-July SLP: (a) discrete aerial perspective shading; (b) continuous
aerial perspective shading. Status text: 23 of 57 lines selected (40.35%).]

Fig. 6. A screen shot of the aerial perspective shading capability which can be
used in either discrete (a) or continuous (b) shading mode. The line colors are
determined based on the location of the line with respect to the context,
focus, and query areas of the axes and, in continuous mode, the distance from
the mouse cursor is encoded with color value. In the above examples, the mouse
cursor is positioned at the bottom of the second axis (the Intense Hurricanes
axis) which highlights the storm seasons with above average intense hurricane
activity. The continuous shading mode gives more emphasis to the lines
representing the most active seasons.
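
The discrete shading rule can be summarized as a three-way classification of each line. The following sketch follows the rule as stated in the text; the data layout, the function name, and the specific gray levels (0.0 is black, 1.0 is white) are assumptions for illustration.

```python
# Assumed gray levels; the report does not give numeric values.
GRAY = {"context": 0.85, "focus": 0.65, "query": 0.25}


def classify_line(values, axes):
    """Classify one relation line for discrete aerial perspective shading.

    values: the line's value on each axis, in axis order.
    axes:   per-axis dicts with "focus" and "query" (low, high) ranges.
    """
    pairs = list(zip(values, axes))
    # A single point outside the focus range (i.e., in a context area)
    # makes the whole line a context line.
    if any(not (ax["focus"][0] <= v <= ax["focus"][1]) for v, ax in pairs):
        return "context"
    # Every point between the two query sliders: a query line.
    if all(ax["query"][0] <= v <= ax["query"][1] for v, ax in pairs):
        return "query"
    return "focus"


axes = [
    {"focus": (0, 10), "query": (2, 8)},
    {"focus": (0, 10), "query": (4, 6)},
]
for line in ([5, 5], [5, 9], [5, 11]):
    kind = classify_line(line, axes)
    print(line, kind, GRAY[kind])
```

Drawing the lines in the order context, focus, query would also reproduce the depth ordering in which context lines sit beneath the others.
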

In the continuous mode, non-context lines go through an additional step to
encode the distance of the line from the mouse cursor. Query lines that are
nearest to the mouse cursor are shaded with the darkest gray color while lines
farthest from the mouse cursor are shaded with a lighter gray. The other
query lines are shaded according to a non-linear fall-off function that yields
a gradient of gray colors between the extremes. Consequently, the lines that
are nearest to the mouse cursor are more prominent to the viewer due to the
more drastic color contrast and depth ordering treatments (Fig. 6), giving the
viewer the ability to effectively use the mouse to perform rapid, visual
queries.
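
The text does not specify the fall-off function, so the sketch below substitutes a simple power-law curve purely for illustration. The function name, the exponent, and the gray levels (0.0 is black, 1.0 is white) are assumptions, not the report's values.

```python
def continuous_gray(distance, max_distance, darkest=0.1, lightest=0.6,
                    exponent=2.0):
    """Gray level for a query line at a given distance from the cursor.

    The nearest line gets `darkest`, the farthest gets `lightest`, and the
    lines in between follow an assumed non-linear (power-law) fall-off.
    """
    if max_distance <= 0:
        return darkest
    t = min(max(distance / max_distance, 0.0), 1.0)
    falloff = 1.0 - (1.0 - t) ** exponent  # 0 at the cursor, 1 at the far end
    return darkest + (lightest - darkest) * falloff


# Hypothetical cursor distances in pixels.
distances = [0.0, 10.0, 50.0, 100.0]
grays = [continuous_gray(d, max(distances)) for d in distances]
print(grays)

# Depth ordering: draw the farthest lines first so the nearest end up on top.
draw_order = sorted(range(len(distances)), key=lambda i: distances[i],
                    reverse=True)
print(draw_order)
```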


4.3 Representing Key Statistics


To support the advanced interaction capabilities of this application, each axis
also shows key statistical quantities for the relation points that are displayed
in the focus region (Siirtola, 2000; Hauser et al., 2002). For each axis, the
mean, standard deviation, and the frequency information are calculated for
points in the focus area. As shown in Fig. 2, the mean value and the standard
deviation range are shown using two yellow half circles and two cyan
rectangles, respectively. Within each axis bar, the frequency information is
also displayed by representing histogram bins as small, gray rectangles with
gray values proportional to the number of lines that pass through the bin's
region.
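
The per-axis statistics described above can be computed along the following lines. This is a sketch under stated assumptions: the function name and bin count are hypothetical, and the use of the population standard deviation is an assumption because the text does not say which form is used.

```python
from statistics import mean, pstdev


def axis_statistics(values, focus_min, focus_max, bins=10):
    """Mean, standard deviation, and histogram counts for focus-area points.

    Only values inside [focus_min, focus_max] contribute. Returns
    (None, None, counts) if no value falls inside the focus area.
    """
    in_focus = [v for v in values if focus_min <= v <= focus_max]
    counts = [0] * bins
    width = (focus_max - focus_min) / bins
    for v in in_focus:
        # The top edge of the focus area belongs to the last bin.
        idx = min(int((v - focus_min) / width), bins - 1) if width > 0 else 0
        counts[idx] += 1
    if not in_focus:
        return None, None, counts
    return mean(in_focus), pstdev(in_focus), counts


# Hypothetical axis values; 100 lies outside the focus area and is ignored.
m, s, counts = axis_statistics([1, 2, 3, 4, 100], 0, 4, bins=4)
print(m, s, counts)

# Gray value of each bin proportional to its count (0 = empty, 1 = fullest).
peak = max(counts)
print([c / peak for c in counts])
```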


5  Parallel  Coordinates  Validation:  North  Atlantic  Case  Study 


As discussed previously, regression analysis is often employed to identify the
most relevant climate relationships for tropical cyclone activity. Such
techniques are effective in screening data and providing quantitative
associations. However, multivariate analysis can be difficult. This section
outlines how stepwise regression and parallel coordinates can complement each
other in such an analysis.

Stepwise regression with a "backwards glance" is used, which selects the
optimum number of most important variables using a predefined significance
value (90% in this study). Stepwise regression can complement parallel
coordinate visualization by isolating the significant variables in a
quantitative fashion. An interactive parallel coordinates visualization can
then be used to develop a deeper understanding of the complex relationships
between the variables.

An extra step is taken to ensure the proper selection of variables. The initially
chosen variables are examined for multicollinearity; if any two variables have a
correlation with each other greater than 0.5, one is removed and the code rerun.
In this way, the chosen variables are truly independent of each other.
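
A minimal sketch of such a multicollinearity screen is given below. It rests on several assumptions that the text does not specify: the threshold is applied to the absolute Pearson correlation, the lower-ranked variable of a correlated pair is the one removed, and the function names `pearson` and `drop_collinear` are hypothetical. The sketch only performs the screening; rerunning the stepwise regression on the surviving variables is left to the caller.

```python
from math import sqrt
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length, non-constant sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(va * vb)


def drop_collinear(predictors, ranking, threshold=0.5):
    """Keep a variable only if it is weakly correlated with all kept ones.

    predictors: dict mapping a variable name to its list of yearly values.
    ranking: names ordered from most to least important (an assumed input,
             e.g. the order in which the regression selected them).
    """
    kept = []
    for name in ranking:
        if all(abs(pearson(predictors[name], predictors[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept
```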

A normalization procedure is also applied so that the variables can be compared
on an equal footing. Denoting $\sigma$ as the standard deviation of a variable,
$y$ as the dependent variable (named storms, hurricanes, or intense hurricanes
in this study), $\bar{x}_i$ as the mean of predictor $x_i$, and $\bar{y}$ as the
dependent variable mean, a number $k$ of statistically significant predictors
are normalized by the following regression:

$$ \frac{y - \bar{y}}{\sigma_y} = \sum_{i=1}^{k} c_i \, \frac{x_i - \bar{x}_i}{\sigma_i} \qquad (1) $$


The advantage of this approach is that the importance of a predictor may
be assessed by comparing regression coefficients $c_i$ between different variables,
and that the y-intercept becomes zero.

In addition, $\bar{x}_i$ may be interpreted (to a first approximation) as a “threshold”
value which distinguishes between positive and negative contributions (for
$c_i > 0$), and the opposite for negative $c_i$. Years when independent variables
contain large deviations from the mean could be associated with very active
or inactive years, and require closer examination. As will be seen, the parallel
coordinates technique facilitates the examination of active and quiet Atlantic
hurricane seasons.
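
The normalized regression can be illustrated with a short sketch. It is not the code used in this study: it assumes an ordinary least-squares fit on standardized variables (population standard deviation), solved through the normal equations, and the names `standardize`, `solve`, and `normalized_coefficients` are hypothetical. Because every standardized variable has zero mean, no intercept term is fitted, which mirrors the zero y-intercept noted above.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return (v - mean) / std for each value (population std assumed)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def normalized_coefficients(predictors, y):
    """Least-squares c_i for standardized predictors and standardized y.

    predictors: list of k columns, each a list of yearly values.
    """
    z = [standardize(col) for col in predictors]
    zy = standardize(y)
    k = len(z)
    # Normal equations (Z^T Z) c = Z^T z_y for the standardized variables.
    gram = [[sum(p * q for p, q in zip(z[i], z[j])) for j in range(k)]
            for i in range(k)]
    rhs = [sum(p * q for p, q in zip(z[i], zy)) for i in range(k)]
    return solve(gram, rhs)
```

Since all predictors are expressed in units of their own standard deviation, the magnitudes of the returned coefficients can be compared directly, and the sign of $x_i - \bar{x}_i$ together with the sign of $c_i$ indicates whether a given year's predictor value pushes the estimate above or below the mean.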

The 16 potential variables listed in Table 2 are examined in the stepwise
regression, yielding several independent variables for each dependent variable.
These results show that several climate factors impact tropical cyclone
activity. The chosen predictors are shown in Table 3, along with their normalized
regression coefficient and sample mean. The explained variance ($R^2$) is shown
in the three table headings.

The stepwise regression shows that only one significant El Nino variable (late
winter South Indian Ocean 200-mb meridional winds (4)) impacts the total number
of storms; it is the second most influential predictor. Late winter northwest
coastal European SST (14) is the leading predictor. The North Atlantic
Oscillation (manifested by 500-mb geopotential height in the North Atlantic (12))
ranks third, and is also the only variable seen in all three tables. This suggests
that the presence of a ridge in the Atlantic is conducive to an above average
tropical cyclone season. Finally, low SLP in the southeast Gulf of Mexico (11)
also encourages the formation of tropical cyclones. Note that the coefficient
has a negative sign, showing that the lower the pressure, the better the chance
of tropical cyclone activity.

For number of hurricanes, the analysis surprisingly shows that October-November
SLP in the Gulf of Alaska (6) is the most important predictor. The physical
role is not clear, although scientists know it is correlated with El Nino activity.
Northeast subtropical Atlantic SST (16) and North Atlantic 500-mb geopotential
height (12) are tied for second, and southeast Gulf SLP (11) again ranks fourth.
The explained variance is 42%, more than the 34% for named storms.
This suggests stronger predictor relationships for number of hurricanes.

For intense hurricanes, the explained variance increases to 54%. In this case, the
North Atlantic November 500-mb height variable (12) is the strongest predictor.
Early summer tropical Atlantic SLP (10) ranks number two, followed by
September 500-mb geopotential height in western North America (7) and
February SST off northwest coastal Europe (14). The higher explained variance and
distinctly different chosen predictors suggest that different environmental
influences are required for intense hurricanes. This analysis identifies the presence of
high pressure in the western U.S. and over the Atlantic, low summer Atlantic
SLP, and warm SST as necessary conditions for intense hurricanes.

Because there is unexplained variance and several predictors, can parallel
coordinates glean any more information? To answer this question, the datasets
are stratified into below normal, normal, and above normal seasons using the
software's interactive capabilities, and the significant predictors identified by
the stepwise regression are analyzed visually. Using the key statistical
indicators, the below normal, normal, and above normal seasons are determined
by moving the query sliders for the axis of interest to encapsulate the lines
below the standard deviation range, within the standard deviation range, and
above the standard deviation range, respectively. After setting the query
sliders, the aerial perspective shading highlights the relationships of interest, thus
enabling analysis of the variables.
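
The same stratification can be expressed outside the interactive tool. The sketch below is illustrative only and assumes that "below normal" means a seasonal count less than the mean minus one (population) standard deviation and "above normal" means a count greater than the mean plus one standard deviation; the function names `stratify` and `share_below_mean` are hypothetical, and any season identifiers or values passed in are the caller's own data.

```python
from statistics import mean, pstdev


def stratify(count_by_season):
    """Group seasons by where their count falls relative to mean +/- one std.

    count_by_season: dict mapping a season identifier to its storm count.
    """
    values = list(count_by_season.values())
    mu, sigma = mean(values), pstdev(values)
    low, high = mu - sigma, mu + sigma

    groups = {"below": [], "normal": [], "above": []}
    for season, count in sorted(count_by_season.items()):
        if count < low:
            groups["below"].append(season)
        elif count > high:
            groups["above"].append(season)
        else:
            groups["normal"].append(season)
    return groups


def share_below_mean(predictor_by_season, seasons):
    """Fraction of the selected seasons whose predictor value lies below
    the predictor's mean over all seasons."""
    mu = mean(list(predictor_by_season.values()))
    below = sum(1 for s in seasons if predictor_by_season[s] < mu)
    return below / len(seasons)
```

Applying `share_below_mean` to each significant predictor for one stratum gives a numerical counterpart to the visual clustering of lines on an axis: a share near 0 or 1 corresponds to a tight cluster on one side of the mean, while a share near 0.5 corresponds to scatter.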

Figure 7 shows a plot for seasons with below normal named storms (sample
size of 16). Even though the regression shows February Atlantic SST (14)
as the most important overall predictor, it is not as effective for discerning
inactive seasons. The plot shows considerable scatter, with only 6 years
of significantly below average SST. The dynamic query capabilities of this
parallel coordinates application make these combined queries and subsample
analysis an intuitive exercise. September-November Gulf of Mexico SLP (11)
also exhibits much scatter, with a slight majority of years with above normal
pressure. However, February-March 200-mb South Indian Ocean meridional
winds (4), a surrogate measurement of El Nino, show 15 seasons (94%)
of strong north winds, tightly clustered in the plots. This suggests El Nino
is the major contributor to inactive Atlantic tropical cyclone seasons. Note
also that below normal November North Atlantic 500-mb geopotential heights
(12) play a pivotal role for quiet seasons. Fourteen seasons (87%) contain
lower geopotential heights in November, suggesting the presence of upper-level
troughs which can shear tropical cyclones. However, this signal is not as
strong as the El Nino predictor. Additionally, many unshaded lines exist for
positive 200-mb V, showing that other factors besides El Nino contribute to
normal and active seasons. In fact, a similar parallel coordinates stratification
analysis shows that November North Atlantic 500-mb geopotential heights
(12) and September-November Gulf of Mexico SLP (11) tend to be the critical
players for an active tropical cyclone season (not shown).

Figure  8  shows  seasons  with  below  normal  hurricane  activity  (19  seasons). 
El  Nino  again  tends  to  dominate  the  signal  through  the  fall  Gulf  of  Alaska 
SLP  (6)  term.  However,  in  contrast  to  number  of  named  storms,  Atlantic 
SST  (16)  becomes  important  for  number  of  hurricanes.  This  suggests  that 
when  water  temperature  is  below  normal,  tropical  storms  will  have  difficulty 
reaching  hurricane  status.  For  above  normal  hurricane  activity  (Fig.  9),  June- 
July  Atlantic  SST  (16),  November  North  Atlantic  500-mb  geopotential  height 
(12),  and  Gulf  of  Mexico  SLP  (11)  tend  to  exert  dominant  roles,  with  El  Nino 
a  secondary  factor. 

Intense hurricanes warrant special consideration, since they cause 80% of the
economic damage from tropical cyclones. Figure 10 shows that cold February
Atlantic SSTs (14) and high Atlantic June-July SLP (10) tend to reduce the
number of intense hurricanes, with November North Atlantic 500-mb
geopotential heights (12) playing a secondary role and September 500-mb
geopotential heights in western North America (7) playing no role. In contrast,
all four predictors have tightly clustered lines, showing that they all play dominant
roles in seasons with above normal intense hurricane activity (Fig. 11). These
terms are associated with the presence of ridges in the western U.S. and the
Atlantic, below average Atlantic SLP, and warm wintertime Atlantic SST off
the northwestern European coast. Ridges are low shear environments,
showing that the lack of upper level troughs is an important factor for seasons with
many intense hurricanes. Low SLP indicates minimal subsidence. Sinking air
suppresses cloud growth and also dries the lower atmosphere, neither of which
is conducive to the formation and development of tropical cyclones. Low
SLP also could indicate better organized tropical waves (from which many
Atlantic tropical cyclones form). Warm wintertime northeast Atlantic water
also is a good precursor for above average intense hurricane activity.

This parallel coordinates application can also investigate the differences
between the extremely busy 2005 season and the slightly below average 2006
season. Figure 12 shows the 2005 and 2006 seasons along with the chosen
predictors from all three categories (named storms, hurricanes, and intense
hurricanes) listed in Table 3. This plot reveals that most of the terms are
nearly the same except for October-November SLP in the Gulf of Alaska (6)
(above average in 2005, below average in 2006) and June-July SLP in the
tropical Atlantic (10) (below average in 2005, above average in 2006). Klotzbach
et al. (2006b) and Bell et al. (2007) show that the tropical Atlantic was quite
dry through most of the 2006 hurricane season due to subsidence associated
with the onset of an unusually late ENSO event (indicated by the Gulf of
Alaska SLP), as well as frequent outbreaks of African dust storms that year.


6  Conclusion 


It has been shown that parallel coordinates, a visualization technique designed
specifically for multivariate information, can be used to confirm and clarify the
results of stepwise regression when enhanced with interactive tools. The added
capabilities discussed in this paper include focus+context filtering, dynamic
visual queries with sliders, statistical displays, relocatable axes, axis inversion,
details-on-demand, a pop-up menu interface, and aerial perspectives. An
application to a tropical cyclone dataset shows that, while multiple regression
provides the most significant variables, visual analysis using a dynamic
parallel coordinates system facilitates a deeper understanding of the environmental
causes for above average and below average hurricane seasons.

Table 3

Significant climate variables chosen from Table 2 by the stepwise regression for
number of named storms, hurricanes, and intense hurricanes in 1950-2006. Also
shown are the explained variance $R^2$, the normalized coefficients c, and the
sample mean.

Number of Named Storms ($R^2$ is 34%)

Chosen Variables                  Normalized Coefficients c   Sample Mean
Feb. SST (14)                      0.302                      13.8
Feb.-Mar. 200-mb V (4)            -0.244                      2.5
Nov. 500-mb Geopot. Ht. (12)       0.232                      5213
Sep.-Nov. SLP (11)                -0.175                      1015.0

Number of Hurricanes ($R^2$ is 42%)

Chosen Variables                  Normalized Coefficients c   Sample Mean
Oct.-Nov. SLP (6)                 -0.284                      1009.6
June-July SST (16)                 0.259                      22.2
Nov. 500-mb Geopot. Ht. (12)       0.258                      5213
Sep.-Nov. SLP (11)                -0.208                      1015.0

Number of Intense Hurricanes ($R^2$ is 54%)

Chosen Variables                  Normalized Coefficients c   Sample Mean
Nov. 500-mb Geopot. Ht. (12)       0.345                      5213
June-July SLP (10)                -0.315                      1016.2
Sep. 500-mb Geopot. Ht. (7)        0.292                      5753.3
Feb. SST (14)                      0.235                      13.8